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ABSTRACT 

Solving the Euler equations of ideal hydrodynamics as accurately and efficiently as possible is 
a key requirement in many astrophysical simulations. It is therefore important to continuously 
advance the numerical methods implemented in current astrophysical codes, especially also 
in light of evolving computer technology, which favours certain computational approaches 
over others. Here we introduce the new adaptive mesh refinement (AMR) code TENET, which 
employs a high order discontinuous Galerkin (DG) scheme for hydrodynamics. The Euler 
equations in this method are solved in a weak formulation with a polynomial basis by means 
of explicit Runge-Kutta time integration and Gauss-Legendre quadrature. This approach of¬ 
fers significant advantages over commonly employed second order finite volume (EV) solvers. 
In particular, the higher order capability renders it computationally more efficient, in the sense 
that the same precision can be obtained at significantly less computational cost. Also, the DG 
scheme inherently conserves angular momentum in regions where no limiting takes place, 
and it typically produces much smaller numerical diffusion and advection errors than a EV 
approach. A further advantage lies in a more natural handling of AMR refinement bound¬ 
aries, where a fail-back to first order can be avoided. Einally, DG requires no wide stencils 
at high order, and offers an improved data locality and a focus on local computations, which 
is favourable for current and upcoming highly parallel supercomputers. We describe the for¬ 
mulation and implementation details of our new code, and demonstrate its performance and 
accuracy with a set of two- and three-dimensional test problems. The results confirm that DG 
schemes have a high potential for astrophysical applications. 
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1 INTRODUCTION 

Through the availability of ever more powerful computing re¬ 
sources, computational fluid dynamics has become an important 
part of astrophysical research. It is of crucial help in shedding light 
on the physics of the baryonic part of our Universe, and contributes 
substantially to progress in our theoretical understanding of galaxy 
formation and evolution, of star and planet formation, and of the 
dynamics of the intergalactic medium. In addition, it plays an im¬ 
portant role in planetary science and in solving engineering prob¬ 
lems related to experimental space exploration. 

In the astrophysics community, most numerical work thus far 
on solving the Euler equations of ideal hydrodynamics has been 
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carried out with two basic types of codes. On the one hand, there is 
the broad class of Eulerian methods which utilize classical hydro- 
dynamic solvers operating on fixed Cartesian grids (e.g. Stone et al. 
2008), or on meshes which can adjust their resolution in space with 
the adaptive mesh refinement (AMR) technique (e.g. Fryxell et al. 
2000; Teyssier 2002; Mignone et al. 2007; Bryan et al. 2014). On 
the other hand, there are pseudo-Lagrangian discretizations in the 
form of smoothed particle hydrodynamics (SPH), which are other 
flexible and popular tools to study many astrophysical problems 
(e.g. Wadsley et al. 2004; Springel 2005). 

Some of the main advantages and drawbacks of these meth¬ 
ods become apparent if we recall the fundamental difference in 
their numerical approach, which is that grid codes discretize space 
whereas SPH decomposes a fluid in terms of mass elements. The 
traditional discretization of space used by Eulerian methods yields 
good convergence properties, and a high accuracy and efficiency 
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for many problems. Furthermore, calculations can be straightfor¬ 
wardly distributed onto parallel computing systems, often allowing 
a high scalability provided there is only little communication be¬ 
tween the cells. The discretization of mass used by SPH on the 
other hand results in a natural resolution adjustment in converg¬ 
ing flows such that most of the computing time and available res¬ 
olution are dedicated to dense, astrophysically interesting regions. 
Moreover, Lagrangian methods can handle gas with high advection 
velocities without suffering from large errors. However, both meth¬ 
ods have also substantial weaknesses, ranging from problems with 
the Galilean invariance of solutions in the case of grid codes, to a 
suppression of fluid instabilities and noise in the case of SPH. 

This has motivated attempts to combine the advantages of tra¬ 
ditional grid-based schemes and of SPH in new methods, such as 
moving mesh-codes (Springel 2010; Duffell & MacFadyen 2011) 
or in mesh-free methods that retain a higher degree of accuracy 
(Lanson & Vila 2008; Hopkins 2014) than SPH. Both of these new 
developments conserve angular momentum better than plain Eule- 
rian schemes, while still capturing shocks accurately, a feature that 
is crucial for many applications in astrophysics. However, these 
new methods need to give up the simple discretization of space 
into regular grids, meaning also that their computational efficiency 
takes a significant hit, because the regular memory access patterns 
possible for simple structured discretizations are ideal for leverag¬ 
ing a high fraction of the peak performance of current and future 
hardware. 

In fact, the performance of supercomputers has increased ex¬ 
ponentially over the last two decades, roughly following the em¬ 
pirical trend of Moore’s law, which states that the transistor count 
and hence performance of computing chips doubles roughly every 
two years. Soon, parallel computing systems featuring more than 
10 million cores and “exascale” performance in the range of lO'* 
floating point operations per second are expected. Making full use 
of this enormous compute power for astrophysical research will re¬ 
quire novel generations of simulation codes with superior paral¬ 
lel scaling and high resilience when executed on more and more 
cores. Also, since the raw floating point speed of computer hard¬ 
ware grows much faster than memory access speed, it is imperative 
to search for new numerical schemes that focus on local computa¬ 
tions within the resolution elements. Those allow for data locality 
in the memory leading to a fast data transfer to the CPU cache, and 
furthermore, the relative cost for communication with neighbour¬ 
ing resolution elements is reduced. 

A very interesting class of numerical methods in this context 
are so-called discontinuous Galerkin (DG) schemes, which can be 
used for a broad range of partial differential equations. Since the 
introduction of DG (Reed & Hill 1973) and its generalization to 
nonlinear problems (Cockbum & Shu 1989, 1991, 1998; Cockburn 
et al. 1989, 1990), it has been successfully applied in diverse fields 
of physics such as aeroacoustics, electro magnetism, fluid dynam¬ 
ics, porous media, etc. (Cockbum et al. 2011; Gallego-Valencia 
et al. 2014). On the other hand, in the astrophysics community 
the adoption of modern DG methods has been fairly limited so far. 
However, two recent works suggest that this is about to change. 
Mocz et al. (2014) presented a DG method for solving the mag¬ 
netohydrodynamic (MHD) equations on arbitrary grids as well as 
on a moving Voronoi mesh, and Zanotti et al. (2015) developed an 
AMR code for relativistic MHD calculations. The present paper is 
a further contribution in this direction and aims to introduce a novel 
DG-based hydrodynamical code as an alternative to commonly em¬ 
ployed schemes in the field. 

DG is a finite element method which incorporates several as¬ 


pects from finite volume (FV) methods. The partial differential 
equation is solved in a weak formulation by means of local basis 
functions, yielding a global solution that is in general discontinuous 
across cell interfaces. The approach requires communication only 
between directly neighbouring cells and allows for the exact con¬ 
servation of physical quantities including angular momentum. Im¬ 
portantly, the method can be straightforwardly implemented with 
arbitrary spatial order, since it directly solves also for higher or¬ 
der moments of the solution. Unlike in standard FV schemes, this 
higher order accuracy is achieved without requiring large spatial 
stencils, making DG particularly suitable for utilizing massive par¬ 
allel systems with distributed memory because of its favourable 
compute-to-communicate ratio and enhanced opportunities to hide 
communication behind local computations. 

In order to thoroughly explore the utility of DG in real astro- 
physical applications, we have developed TENET, an MPI-parallel 
DG code which solves the Euler equations on an AMR grid to ar¬ 
bitrary spatial order. In our method, the solution within every cell 
is given by a linear combination of Legendre polynomials, and the 
propagation in time is accomplished with an explicit Runge-Kutta 
(RK) time integrator. A volume integral and a surface integral have 
to be computed numerically for every cell in every timestep. The 
surface integral involves a numerical flux computation which we 
carry out with a Riemann solver, similar to how this is done in 
standard Godunov methods. In order to cope with physical discon¬ 
tinuities and spurious oscillations we use a simple minmod limiting 
scheme. 

The goal of this paper is to introduce the concepts of our 
DG implementation and compare its performance to a standard EV 
method based on the reconstruct-solve-average (RSA) approach. In 
this traditional scheme, higher order information is discarded in the 
averaging process and recomputed in the reconstruction step, lead¬ 
ing to averaging errors and numerical diffusion. DG on the other 
hand does not only update the cell-averaged solution every cell, but 
also higher order moments of the solution, such that the reconstruc¬ 
tion becomes obsolete. This aspect of DG leads to important ad¬ 
vantages over EV methods, in particular an inherent improvement 
of angular momentum conservation and much reduced advection 
errors, especially for but not restricted to smooth parts of the so¬ 
lutions. These accuracy gains and the prospect to translate them 
to a lower computational cost at given computational error form 
a strong motivation to investigate the use of DG in astrophysical 
applications. 

The paper at hand is structured as follows. In Section 2, we 
present the general methodology of our DG implementation for 
Cartesian grids. The techniques we adopt for limiting the numer¬ 
ical solution are described in Section 3, and the generalization to 
a mesh with adaptive refinement is outlined in Section 4. These 
sections give a detailed account of the required equations and dis¬ 
cretization formulae for the sake of clarity and definiteness, some¬ 
thing that we hope does not discourage interested readers. We then 
validate our DG implementation and compare it to a standard sec¬ 
ond order Godunov FV solver with two- and three-dimensional test 
problems in Section 5. Finally, Section 6 summarizes our findings 
and gives a concluding discussion. 

2 DISCONTINUOUS GALERKIN HYDRODYNAMICS 
2.1 Euler equations 

The Euler equations are conservation laws for mass, momentum, 
and total energy of a fluid. They are a system of hyperbolic partial 
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differential equations and can be written in compact form as 
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The unknown quantities are density p, velocity v, pressure p, and 
total energy per unit mass e. The latter can be expressed in terms 
of the internal energy per unit mass u and the kinetic energy of the 
fluid, e = u + For an ideal gas, fhe system is closed with the 
equation of state 


p=pu(y- 1), 


(4) 


where y denotes the adiabatic index. 


2.2 Solution representation 

We partition the domain by non-overlapping cubical cells, which 
may be refined using AMR techniques as explained in later sec¬ 
tions. Moreover, we follow the approach of a classical modal DG 
scheme, where the solution in cell K is given by a linear combina¬ 
tion of N(k) orthogonal and normalized basis functions (l>f: 

N(k) 

u‘^(x,t] = YjWf(l)4,f(x). (5) 

1=1 

In this way, the dependence on time and space of the solution is 
split into time-dependent weights, and basis functions which are 
constant in time. Consequently, the state of a cell is completely 
characterized by N(k) weight vectors wj(t). 

The above equation can be solved for the weights by multiply¬ 
ing with the corresponding basis function and integrating over 
the cell volume. Using the orthogonality and normalization of the 
basis functions yields 

= iPi r= 1, ■ ■ ■. mx (6) 

1^1 Jk 

where lA"! is the volume of the cell. The first basis function is chosen 
to be = 1 and hence the weight trf is the cell average of the 
state vector . The higher order moments of the state vector are 
described by weights with j > 2. 

The basis functions can be defined on a cube in terms of scaled 
variables 

0,(^):[-l,l]'^R. (7) 

The transformation between coordinates ^ in the cell frame of ref¬ 
erence and coordinates x in the laboratory frame of reference is 

( 8 ) 


where Ax^ and x^ are edge length and cell centre of cell K, re¬ 
spectively. For our DG implementation, we construct a set of three- 
dimensional polynomial basis functions with a maximum degree of 
k as products of one-dimensional scaled Legendre polynomials P\ 

= [Pu{^dPv{^ 2 )PJki)\u, V, w € No A n -H V -H w < k}. 

(9) 

The first few Legendre polynomials are shown in Appendix A. The 
number of basis functions for polynomials with a maximum degree 
of k is 

k k-u k-ii-v 

Nik)^ zzz I = ^(k+l)(k + 2)(k + 3). (10) 

if=0 v=0 vi-0 

Furthermore, when polynomials with a maximum degree of k are 
used, a scheme with spatial order p = k-t 1 is obtained. For example, 
linear basis functions lead to a scheme which is of second order in 
space. 


2.3 Initial conditions 

Given initial conditions u(x, t = 0) = u(x, 0), we have to provide an 
initial state for the DG scheme which is consistent with the solution 
representation. To this end, the initial conditions are expressed by 
means of the polynomial basis on cell K, which will then be 

N(k) 

uUx,0) = J]wfmfix). (11) 

/=! 

If the initial conditions at hand are polynomials with degree < k, 
this representation preserves the exact initial conditions, otherwise 
equation (11) is an approximation to the given initial conditions. 
The initial weights can be obtained by performing an L^-projection, 

min j (uf(x,0) - Uj(x,0)] dV, (12) 

Jk 

where i = 1, • • ■. 5 enumerates the conserved variables. The projec¬ 
tion above leads to the integral 

<(0)= 7^ r«(A:,0)<^f(x)dV; j=h...,N{k), (13) 

1^1 Jk 

which can be transformed to the reference frame of the cell, viz. 

H'f(0)=5r «(f,0)</.,(f)d^, j=\,...,N(k). (14) 

We solve the integral numerically by means of tensor product 
Gauss-Legendre quadrature (hereafter called Gaussian quadrature) 
with (k+ 1)^ nodes: 

(*+iF 

<(0)^ g Z j=l,...,N(k). (15) 

q=l 

Here, is the position of the quadrature node q in the cell frame 
of reference and tu® denotes the corresponding quadrature weight. 
The technique of Gaussian quadrature is explained in more detail 
in Appendix B. 


2.4 Evolution equation for the weights 

In order to derive the DG scheme on a cell K, the Euler equations 
for a polynomial state vector are multiplied by the basis function 
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o cell quadrature points 
• face quadrature points 


term of the evolution equation (17). This is because the solution is 
discontinuous across cell interfaces. We hence have to introduce a 
numerical flux function f ,h^, which in general depends 

on the states left and right of the interface and on the normal vector. 
With this numerical flux, the third term in equation (17) takes the 


form 

V r 


(Av^)2 


/„n„(x)0f (x) d5 
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(Ax^) 

4 


r /(«^-(^, t), 0 , m) ^j(^) 


.K\2 <*+h' 




Aefl[-l,l]3 <7=1 


( 20 ) 


Figure 1. In the DG scheme, a surface integral and a volume integral have to 
be computed numerically for every cell, see equation (17). In our approach, 
we solve these integrals by means of Gaussian quadrature (Appendix B). 
This example shows the nodes for our third order DG method (with sec¬ 
ond order polynomials) when used in a two-dimensional configuration. The 
black nodes indicate the positions where the surface integral is evaluated, 
which involves a numerical flux calculation with a Riemann solver. The 
white nodes are used for numerically estimating the volume integral. 


0 ^ and integrated over the cell volume, 


r ^ 

Jk 



dh 

dXa 


<^fdV = 0. 


(16) 


Integration by parts of the flux divergence term and a subsequent 
application of Gauss’ theorem leads to 




^ r 


/„n„<^fd5 =0, 


(17) 


where h = («i, 712 ,^ 3 )^ denotes the outward-pointing unit normal 
vector of the surface dK. In the following, we discuss each of the 
terms separately. 

According to equation ( 6 ), the first term is simply the time 
variation of the weights, 

H r dw^ 

,18, 

The second and third terms are discretized by transforming the in¬ 
tegrals to the cell frame and applying Gaussian quadrature (Fig. 1, 
Appendix B). With this procedure the second term becomes 


()’=1 ^=1 




d^ 


d^a 


(19) 


Note that the transformation of the derivative gives a factor 

of 2, see equation ( 8 ). The volume integral is computed by Gaussian 
quadrature with k + \ nodes per dimension. These nodes allow the 
exact integration of polynomials up to degree 21 : + 1 . 

The flux functions in the above expression can be evalu¬ 
ated analytically, which is not the case for the fluxes in the last 


Here for each interface of the normalized cell, a two-dimensional 
Gaussian quadrature rule with {k+\)^ nodes is applied. The numer¬ 
ical flux across each node can be calculated with a one-dimensional 
Riemann solver, as in ordinary Godunov schemes. For our DG 
scheme, we use the positivity preserving HLLC Riemann solver 
(Toro 2009). 

In order to model physics extending beyond ideal hydrody¬ 
namics, source terms can be added on the right-hand side of equa¬ 
tion (16). Most importantly, the treatment of gravity is accom¬ 
plished by the source term 


0 


s = 


-pVO 
-pv • VO, 


( 21 ) 


which by projecting onto the basis function (jtj inside cell K and 
discretizing becomes 

r s{x,t)(p'^{x)d.V 
Jk 

° -'[-I.IF 
in 

(22) 

<7=1 

We have now discussed each term of the basic equation (17) 
and arrived at a spatial discretization of the Euler equations of the 
form 

-^+/fx = 0, l,...,A(fe), (23) 

which represents a system of coupled ordinary differential equa¬ 
tions. For the discretization in time we apply an explicit strong sta¬ 
bility preserving Runge-Kutta (SSP RK) scheme (Gottlieb et al. 
2001) of the same order as the spatial discretization. With a com¬ 
bination of an SSP RK method, a positivity preserving Riemann 
solver, and a positivity limiter (see Section 3.4), negative pres¬ 
sure and density values in the hydro scheme can be avoided. The 
Butcher tables of the first- to fourth order SSP RK methods we use 
in our code are listed in Appendix D. 


2.5 Timestep calculation 

If the Euler equations are solved with an explicit time integrator, the 
timestep has to fulfil a Courant-Friedrichs-Lewy (CFL) condition 
for achieving numerical timestep stability. For the DG scheme with 
explicit RK time integration, the timestep depends also on the order 
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p = k + \ of the scheme. We calculate the timestep of cell K 
following Cockburn & Shu (1989) as 


CFL / |vf I + |vf I + |vf I + \ ' 

+ 1 \ Ax^ Ax^ / 


(24) 


where c = ^jyplp is the sound speed and the denominators in the 
parantheses are the edge lengths of the cell. Since we used only cu¬ 
bic cells, we have Axf^ = Axj = Ax|^ = Ax^. In this work, we ap¬ 
ply a global timestep given by the minimum of the local timesteps 
among all cells. The CFL number depends on the choice of flux 
calculation but in practice should be set to a somewhat smaller 
value as formally required. This is because for the calculation of 
the timestep in an RK method of second order or higher, a mathe¬ 
matical inconsistency arises. At the beginning of the timestep, only 
the current velocity and sound speed of the gas are known, whereas 
in principle the maximum speed occurring during all RK stages 
within one step should be used for calculating the CFL timestep to 
be strictly safe. This information is not readily available, and the 
only straightforward way to cope with this problem is to reduce the 
CFL number to a value smaller than mathematically required. For 
the tests presented in this paper, we decided to adopt a conservative 
choice of CFL = 0.2. 

Furthermore, if the positivity limiter is used, the hydro 
timestep Ar^ has to be modified and can be slightly more restric¬ 
tive, as described in Section 3.4. 

Source terms s(u, t) on the right hand side of the Euler equa¬ 
tions can induce additional timestep criteria. In this case, positivity 
of the solution can be enforced by determining the timestep such 
that p(m') > 0 and piu') > 0, with u' = u + 2s(u,t)At (Zhang 
2006). For the gravity source term (21), this leads to the timestep 
limit 


Ar < 

“'gray - 


yjljiy - 1) l^®l 


(25) 


The actual timestep has then to be chosen as the minimum of the 
hydrodynamical and gravity timesteps. 


2.6 Angular momentum conservation 


A welcome side effect of the computation of higher order moments 
of the solution in DG schemes is that angular momentum is inher¬ 
ently conserved. Without loss of generality, we show this conser¬ 
vation property explicitly for the two-dimensional case {z = 0). In 
this case, the angular momentum density is defined as 


L = xpVy - ypVj-, 


(26) 


and the flux momentum tensor in 2D is given by 


//l.2 

/2,2'l 

_ lp+pv\ 

pViVz \ 

1/1.3 

/2,3/ 

\ PVlV2 

p+pvlj 


(27) 


The conservation law for angular momentum can be conveniently 
derived from the Euler equations (1). Multiplying the x-momentum 
equation by y and the y-momentum equation by x, applying the 
product rule and subsequently subtracting the two equations gives 

+ ^(^/i,3 - yfi.i) + - yfi.i) = 0- ( 28 ) 

at ax ay 

In order to obtain the angular momentum conservation law on a cell 
basis, this can be integrated over element K, resulting in 


d 

dr 



-'^(/l,3U| +/2,3«2) -y(/l,2Ul +/2,2n2)dS - 0. 


(29) 


On the other hand, for a DG scheme with order p > 1, we can 
choose the test function to be = y in the x-momentum equation 
and (j)'^ = x 'm the y-momentum equation in the weak formulation 
(17) of the Euler equations: 


1 pviydF- 

r f2.2dv+ 

f f2yds 

= 0, 

(30) 

JK 

Jk 

JdK 



j' pv2xdV' - 

f A.3dV + 

Jk 

f fixds 

JdK 

= 0, 

(31) 


where and /g are the momentum components of the numerical 
flux function. By subtracting the above equations and using /i ^ = 
fi.i = pviV 2 we get the angular momentum equation of DG, viz. 

^ fLydV+ f (/3X-/2y)d5. (32) 

or Jk JdK 

This equation is consistent with the exact equation (29), and hence 
DG schemes of at least second order accuracy are angular mo¬ 
mentum conserving. However, there is one caveat to this inherent 
feature of DG. In non-smooth regions of the solution, a limiting 
scheme has to be applied, which can slightly modify the angular 
momentum within a cell and hence lead to a violation of manifest 
angular momentum conservation. This is also the case for the sim¬ 
ple limiters we shall adopt and describe in the subsequent section. 


3 SLOPE LIMITING 

The choice of the slope-limiting procedure can have a large ef¬ 
fect on the quality of a hydro scheme, as we will demonstrate in 
Section 5.2. Often different limiters and configurations represent a 
trade-off between dissipation and oscillations, and furthermore, the 
optimal slope limiter is highly problem dependent. Consequently, 
the challenge consists of finding a limiting procedure which de¬ 
livers good results for a vast range of test problems. Eor the DG 
scheme, this proves to be even more difficult. The higher order 
terms of the solution should be discarded at shocks and contact 
discontinuities if needed, while at the same time no clipping of ex¬ 
trema should take place in smooth regions, such that the full benefit 
of the higher order terms is ensured. In what follows, we discuss 
two different approaches for limiting the linear terms of the solu¬ 
tion as well as a positivity limiter which asserts non-negativity of 
density and pressure. 


3.1 Component-vrise limiter 

In order to reduce or completely avoid spurious oscillations, we 
have to confine possible over- and undershootings of the high order 
solution of a cell at cell boundaries compared to the cell average 
of neighbouring cells. For that purpose, the weights vr^., w^-, w^. 
which are proportional to the slopes in the x-, y-, and z-directions, 
respectively, are limited by comparing them to the difference of 
cell-averaged values, viz. 

= -^minmod ( V3 h'^,-,/3(w^; - ),/3(wff - wf,)), 

wl- = -^minmod ( V3 w5;,-,/3(w^; - ),y3(w"f - wfj)j , (33) 

w^. = -^minmod ( V3 h'^,-,/3(w'^; - wff ),je(wJ^f - wf,)). 

Here, vv^., w^., vi>^. are the new weights, 

denote cell neighbours in the directions west, east, south, north. 
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bottom, and top, respectively, and the minmod-function is defined 
as 

(jmin(|a|,|fo|,|c|) j = sign(fl) = sign(Z7) = sign(c) 
minmod(a, b,c) = < 

lO otherwise. 

(34) 

Each component of the conserved variables, i = 1,..., 5, is 
limited separately and hence treated independently. The V3-factors 
in equation (33) account for the scaling of the Legendre polyno¬ 
mial Pi(^), and the parameter p e [0.5,1] controls the amount of 
limiting. The choice of p = 0.5 corresponds to a total variation di¬ 
minishing (TVD) scheme for a scalar problem but introduces more 
diffusion compared to j8 = 1. The latter value is less restrictive and 
may yield a more accurate solution. 

If the limited weights are the same as the old weights, i.e. 

Wj . = vvf, = wfj, and wf. = wf., we keep the original compo¬ 
nent of the solution uf (x, t) of the cell. Otherwise, the component 
is set to 


uf (x, t) = wfj + + wfjcpf + wfj(l>f. 


(35) 


where terms with order higher than linear have been discarded. 
Alternative approaches to our simple limiter exist in the form of 
hierarchical limiters (e.g. Kuzmin 2014, and references therein). 
In these schemes the highest order terms are limited first before 
gradually lower order terms are modified one after another. In this 
way also higher order terms can be kept in the limiting procedure. 
Note that in equation (35) the cell-averaged values do not change, 
and thus the conservation of mass, momentum, and energy is un¬ 
affected. However, the limiter may modify the angular momentum 
content of a cell, implying that in our DG scheme it is manifestly 
conserved only in places where the slope limiter does not trigger. 
Clearly, it is a desirable goal for future work to design limiting 
schemes which can preserve angular momentum. 


3.2 Characteristic limiter 

An improvement over the component-wise limiting of the con¬ 
served variables can be achieved by limiting the characteristic vari¬ 
ables instead. They represent advected quantities, and for the Euler 
equations we can define them locally by linearizing about the cell- 
averaged value. 

The transformation matrices £.f, £.f, and £.f are formed from 
the left eigenvectors of the flux Jacobian matrix based on the mean 
values of the conserved variables of cell K, = wf. We list all ma¬ 
trices in Appendix E. The slopes of the characteristic variables can 
then be obtained by the matrix-vector multiplications cf = Jlfwf, 
cf - £.fwf, cf = £,-wf, where wf, wf and wf denote the slopes 
of the conserved variables in the x—, y—, and z-directions, respec¬ 
tively. The transformed slopes are limited as in Section 3.1 with the 
minmod-limiting procedure, viz. 

cf = -^mmmod['/3cf,p£f(wf-w'f'‘),P£f(wf'‘-wf)), 
cf = ^minmod ( '/3cf,p£f(wf - w\’‘),p£f{wf^ - tvf)), 
cf = -^minmod ( 'Scf,p£f{wf - wf‘^),p£f(w\'‘ - tvf)) . 

(36) 

If the limited slopes of the characteristic variables are identical to 
the unlimited ones (i.e. cf = cf, cf = cf, and cf = cf), the orig¬ 
inal solution is kept. Otherwise, the new slopes of the conserved 


variables are calculated with the inverse transformation matrices 
<Rf = {£fy\ Kf = (-Cf)^', and Kf = (£f)-‘ via wf = <Rfcf, 
wf = Kfcf, and wf = Hfcf. In this case, the higher order terms 
of the solution are set to zero and the limited solution in the cell 
becomes 


u'^ (x, t) = wf + wf(pf wf(/)f wf(f>f. (37) 

The difference between the component-wise limiting of the con¬ 
served variables (Section 3.1) and the limiting of the more natural 
characteristic variables is demonstrated with a shock tube simula¬ 
tion in Section 5.2. 


3.3 Total variation bounded limiting 


The limiters discussed so far can effectively reduce overshoot¬ 
ings and oscillations; however, they can potentially also trigger at 
smooth extrema and then lead to a loss of higher order information. 
Considering the goals of a higher order DG scheme, this is a severe 
drawback that can negatively influence the convergence rate of our 
DG code. In order to avoid a clipping of the solution at smooth ex¬ 
trema, the minmod limiter in Sections 3.1 and 3.2 can be replaced 
by a bounded version (Cockburn & Shu 1998), viz. 


minmodB(fl, b, c) 


a if |a| < M{£x‘^f 

minmod(a, b, c) otherwise. 


(38) 


Here, M is a free parameter which is related to the second deriva¬ 
tive of the solution. The ideal choice for it can vary for different test 
problems. Furthermore, in the above ansatz, the amount of limiting 
depends on the resolution since \a\ oc Ax^, which is not the case 
for the right hand side of the inequality. For reasons of simplicity 
and generality, we would however like to use fixed limiter param¬ 
eters without explicit resolution dependence for the tests presented 
in this paper. Hence we define M = MAx^, and use a constant 
value for M to control the strength of the bounding applied to the 
minmod limiter. With the minmod-bounded limiting approach, the 
high accuracy of the solution in smooth regions is retained while 
oscillations, especially in post-shock regions, can be eliminated. 


3.4 Positivity limiting 

When solving the equations of hydrodynamics for extreme flows, 
care has to be taken in order to avoid negative pressure or density 
values within the cells and at cell interfaces. A classical example is 
high Mach number flows, for which in the pre-shock region the to¬ 
tal energy is dominated by the kinetic energy. Because the pressure 
is calculated via the difference between total and kinetic energies, it 
can easily become negative in numerical treatments without special 
precautions. 

The use of an ordinary slope limiter tends to be only of limited 
help in this situation, and only delays the blow-up of the solution. In 
fact, it turns out that even with TVD limiting and arbitrarily small 
timesteps, it is not guaranteed in general that unphysical negative 
values are avoided. Nevertheless, it is possible to construct positiv¬ 
ity preserving FV and DG schemes, which are accurate at the same 
time. Remarkably, the latter means that high order accuracy can be 
retained in smooth solutions and furthermore the total mass, mo¬ 
mentum, and energy in each cell are conserved. For our DG code, 
we adopt the positivity limiting implementation following Zhang 
& Shu (2010). 

For constructing this limiter, a quadrature rule including the 
boundary points of the integration interval is needed. One possible 
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choice consists of m-point Gauss-Lobatto-Legendre (GLL) quadra¬ 
ture rules (Appendix C), which are exact for polynomials of degree 
k < 2m — 3. Let < ... < e [-1,1] be the one-dimensional 
Gauss quadrature points and < ... < 6 [-1,1] the GLL 

quadrature points. Quadrature rules for the domain [-1,-1]^ which 
include the interface Gauss quadrature points can be constructed by 
tensor products of Gauss and GLL quadrature points: 

:l<r<m,l<s<k+l,l<t<k+l], 

Sy = :l<r<k+l,l<s<m,l<t<k+l], 

S, = :l<r<fc-HLl<,?<A:+Ll<r<m|- 

(39) 


The union S ^SjUS^US^of these sets constitutes quadrature 
points of a rule which is exact for polynomials of degree k and 
furthermore contains all the points for which we calculate fluxes 
in equation (20). It can be shown that if the solution is positive 
at these union quadrature points, the solution averaged after one 
explicit Euler step will stay positive if a positivity preserving flux 
calculation and an adequate timestep are used. 

The positivity limiter is operated as follows. For a DG scheme 
with polynomials of maximum order k, choose the smallest integer 
m such that m > {k + 3)/2. Carry out the following computations 
for every cell K. First, determine the minimum density at the union 
quadrature points, 

pL = mmp'^(^)- (40) 


Use this minimum density for calculating the factor 


ef 



(41) 


where e ~ 10“*® is a small number representing the target floor for 
the positivity limiter. Then, modify the higher order terms of the 
density by multiplying the corresponding weights with the calcu¬ 
lated factor. 


j = 2,...,N(k). (42) 

At this point, the density at the union quadrature points is positive 

(> e), and as desired, the mean density = wf j and therefore the 
total mass has not been changed. 

In order to enforce pressure positivity at the union quadrature 
points ^ 6 5, we compute the factor 

e2'^ = minT'*(^), (43) 

with 

if p'^(£) > e 

^ ^ (44) 

such that - u'^)) = e, 

and limit all higher order terms (j >- 2) for all components of the 
state vector: 



j = 2,...,N(k), i=l,...,5. (45) 

The idea of the second case of equation (44) is to calculate a mod¬ 
ification factor T, for the higher order terms u^(^) - «**, such that 
the new pressure at the quadrature point equals e. Furthermore, this 
second case represents a quadratic equation in t,, which has to be 
solved carefully by minimizing round-off errors. In our implemen¬ 
tation, we improve the solution for t, iteratively with a small num¬ 
ber of Newton-Raphson iterations. 

The outlined method ensures positivity of pressure and density 
of the mean cell state after the subsequent timestep under several 




\ 
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Figure 2. Quadrature points of an AMR boundary cell in our 2D, third 
order {k = 2) version of the code. Interfaces with neighbouring cells on 
a finer AMR level are integrated with the quadrature points of the smaller 
cells. In this way no accuracy is lost at AMR boundaries and the order of 
our DG code is unaffected. 


conditions. First, a positivity preserving flux calculation has to be 
used, e.g. the Godunov flux or the Harten-Lax-van Leer flux is suit¬ 
able. Secondly, when adopting an RK method, it should be strong 
stability preserving (SSP); these methods are convex combinations 
of explicit Euler methods, and hence a number of relevant proper¬ 
ties and proofs valid for Euler’s method also hold for these higher 
order time integrators. Lastly, the local timestep for the DG scheme 
has to be set to 


X 1 w 1 

Ad = CFL • min i -, — } 

\2fe-Hl 2 J 

/Ivfl + c'^ Ivfl + c'^ Ivfl + c'^r* 

[ Axf ^ Axf Axf ) ’ ^ 

where fu}** is the first GLL weight. Compared to Zhang & Shu 
(2010) we obtain an additional factor of 1/2 since our reference 
domain is [-1,1] and therefore the sum of the GLL weights is 
2"Li = 2. The first GLL weight is = 1 for our second 
order DG (DG-2) method (k = 1) and Cj™ = 1/3 for our third order 
and fourth order method {k = 2,3). Depending on the order, the 
timestep with positivity limiting can be slightly more restrictive. 
We conclude this section by remarking that the positivity limiting 
procedure is a local operation and does not introduce additional 
inter-cell communication in the DG scheme. 


4 DG WITH AMR 

Many astrophysical applications involve a high dynamic range in 
spatial scales. In grid-based codes, this multi-scale challenge can be 
tackled with the AMR technique. In this approach, individual cells 
are split into a set of smaller subcells if appropriate (see Fig. 3 for 
a sketch of the two-dimensional case), thereby increasing the reso¬ 
lution locally. We adopt a tree-based AMR implementation where 
cubical cells are split into eight subcells when a refinement takes 
place. This allows a particularly flexible adaption to the required 
spatial resolution. 

Our implementation largely follows that in the RAMSES code 
(Teyssier 2002). The tree is organized into cells and nodes. The root 
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node encompasses the whole simulation domain and is designated 
as level I = 0. A node at level I always contains eight children, 
which can be either another node on a smaller level /' = / + 1 , or a 
leave cell. In this way, the mesh of leave cells is guaranteed to be 
volume filling. An example of such an AMR mesh in 2D is shown 
in Fig. 2. 

To distribute the work load onto many MPI processes, the tree 
is split into an upper part, referred to as ‘top tree’ in the following, 
and branches that hang below individual top tree nodes. Every MPI 
process stores a full copy of the top tree, whereas each of the lower 
branches is in principle stored only on one MPI rank. However, 
some of the branch data are replicated in the form of a ghost layer 
around each local tree to facilitate the exchange of boundary in¬ 
formation. The mesh can be refined and structured arbitrarily, with 
only one restriction. The level jump between a cell and its 26 neigh¬ 
bours has to be at most ± 1. This implies that a cell can either have 
one or four split cells as direct neighbours in a given direction. In 
order to fulfil this level jump condition, additional cells may have 
to be refined beyond those where the physical criterion demands a 
refinement. 

To simplify bookkeeping, we store for each cell and node the 
indices of the father node and the six neighbouring cells or nodes. 
In case of a split neighbour, the index of the node at the same level 
is stored instead. To make the mesh smoother and guarantee a buffer 
region around cells which get refined due to the physical refinement 
criterion, additional refinement layers are added as needed. In the 
AMR simulations presented in this work, we use one extra layer of 
refined cells around each cell flagged by the physical criterion for 
refinement. 


4.1 Refinement criterion 

Many different refinement strategies can be applied in an AMR 
code, for example, the refinement and derefinement criterion may 
aim to keep the mass per cell approximately constant. Another 
common strategy for refinement is to focus on interesting regions 
such as shocks, contact discontinuities, or fluid instabilities. In this 
work, we apply the mesh refinement simply at locations where the 
density gradient is steep. To be precise, cell K is refined if the fol¬ 
lowing criterion is met: 

maxfw^o, > a • w,. (47) 

Here, w^q, and are the density changes divided by V3 
along the x—, y—, and z- directions, respectively. The refinement is 
controlled by the target slope parameter Wt and a range factor a. We 
have introduced the latter with the purpose of avoiding an oscillat¬ 
ing refinement-derefinement behaviour. In this work, we adopt the 
values Wt = 0.01 and a = 1.1. The reason for using the density 
changes instead of the physical slopes is that in this way a runaway 
refinement can be avoided. 

Leave nodes of the AMR tree are kept refined if the following 
equation is true: 

,11 1 

maxfwjo.wjg, W 40 ) > - • Wt. (48) 

’ ' ’ a 

The weights on the left hand side are the leave node weights calcu¬ 
lated with a projection of the eight subcell solutions. If inequality 
(48) does not hold, the node gets derefined and the eight subcells 
are merged into one. 


c 

D 

A 

B 



Figure 3. Refinement (an'ow to the left) and coarsening (arrow to the right) 
of a cell in the 2D version of TENET. In both operations the solution on the 
new cell structure is inferred by an -projection of the current polynomial 
solution. In doing so, no information is lost in a refinement, and as much 
information as possible is retained in a derefinement. 


4.2 Mesh refinement 

Let K = {fif 6 [-1,1]^1 be the cell which we want to refine into 
eight subcells, and A, B, C, ... , H denote the daughter cells, as 
illustrated in Fig. 3 for the two-dimensional case. The refinement 
can be achieved without higher order information loss if the solu¬ 
tion polynomial of the coarse cell is correctly projected onto the 
subcells. In the following, we outline the procedure for obtaining 
the solution on subcell A = 6 [-1,0] x [-1,0] x [-1,0]), the 

calculations for the other subcells are done in an analogous way. 

The weights of the solution on subcell A are obtained by solv¬ 
ing the minimization problem 



N(k) 

where «*’ = wfcpf is the given solution on the coarse cell and 

1=1 

N(k} 

= 2 are the polynomials of the conserved variables on 

/=i 

cell A we are looking for. The minimization ansatz (49) is solved 
by projecting the solution of the coarse cell onto the basis functions 
of subcell A, 

< = ^ rdF, 7 = 1,..., N{k). (50) 

l-^l %Ja 

We can plug in the solution and move the time- but not space- 
dependent weights in front of the integral, 
m) , p 

(51) 

If we define the projection matrix 



and introduce for each conserved variable i = 1,..., 5 the weight 
vector w, = (wj,, W 2 ,,,..., w^;,)^, the weights of the subcell solution 
are simply given by the matrix-vector multiplications 

wf^P^wf i=l,...,5. (53) 

The matrix (52) can be computed exactly by transforming the inte¬ 
gral to the reference domain of cell A, 

(Pa);./ =^fff ^)0,(^.,f2,^3)df, (54) 

[-i.iT 

and applying a Gaussian quadrature rule with (k + 1)^ points. The 
projection integrals for the other subcells (B, C,.H) are given in 
Appendix F. The refinement matrices are the same for all the cells; 
we calculate them once in the initialization of our DG code. 
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4.3 Mesh dereflnement 

When the eight cells A, B, ..H are merged into one coarser cell 
K, some information is unavoidably lost. Nevertheless, in order to 
retain as much accuracy as possible, the derefinement should again 
be carried out by means of an -projection, 



Here, u^, ..., denote the given solutions on the cells to be 
derefined, and is the solution on the coarser cell we are looking 
for. The minimization problem is solved by the weights 



7 = l,...,iV(fc). (56) 

We insert the solutions on the cells A, B, ..H and use the defini¬ 
tion of the refinement matrices. 



ZiV® N(k) 


i=l,...,5 , (57) 

If we again use the vector wf consisting of the weights of the solu¬ 
tion for the conserved variable i, the weights of the solution on cell 
K can be computed with the transposed refinement matrices, 

•J-f = ^(P>f + P;'5'f + --- + P;'J’f), i=l,...,5. (58) 

Here we have used the fact that the cells to be merged have the 
same volume, |A| = | 6 | = ... = |//| = 

The refinement and derefinement matrices are orthogonal, i.e. 
P. 4 PJ = I. Thus, if a cell is refined and immediately thereafter dere¬ 
fined, the original solution on the coarse cell is restored. This prop¬ 
erty is desirable for achieving a stable AMR algorith; in our ap¬ 
proach it can be shown explicitly by inserting the subcell weights 
from equation (53) of every subcell in equation (58). While the re¬ 
finement of eight cells into a subcell preserves the exact shape of 
the solution, this is in general not the case for the derefinement. For 
example, if the solution is discontinuous across the eight subcells, 
it cannot be represented in a polynomial basis on the coarse cell. 
After derefining we limit the solution before further calculations 
are carried out, especially for asserting positivity. 

4.4 Limiting with AMR 

In the case of AMR boundary cells, the limiting procedure has to be 
well defined. For the limiting of cell K, the slope limiters described 
in Sections 3.1 and 3.2 need the average values of the neighbouring 
cells in each direction in order to adjust the slope of the conserved 
variables. However, if the cell neighbours are on a different AMR 
level, they are smaller or larger compared to the cell to limit, and a 
single neighbouring cell in a specific direction is not well defined. 

Fortunately, there is a straightforward way in DG to cope with 
this problem. If a neighbouring cell is on a different level, the poly¬ 
nomials of the neighbour are projected onto a volume which is 
equal to the volume of the cell to refine, see Fig. 4. This can be done 


with the usual refinement and derefinement operations as described 
in Sections 4.2 and 4.3. By doing so, the limiting at AMR bound¬ 
aries can be reduced to the limiting procedure on a regular grid. In 
AMR runs, we also slightly adjust the positivity limiting scheme. 
For cells which have neighbours on a higher level, the positivity 
limiter is not only applied at the locations of the union quadrature 
points (equation 39), but additionally at the quadrature points of 
the cell interfaces to smaller neighbours. By doing so, negative val¬ 
ues in the initial conditions of the Riemann problems are avoided, 
which have to be solved in the integration of these interfaces. 


4.5 Main simulation loop 

Our new TENET code has been developed as an extension of the 
AREPO code (Springel 2010), allowing us to reuse AREPO’s input- 
output routines, domain decomposition, basic tree infrastructure, 
neighbour search routines, and gravity solver. This also helps to 
make our DG scheme quickly applicable in many science appli¬ 
cations. We briefly discuss the high-level organization of our code 
and the structure of the main loop in a schematic way, focusing on 
the DG part. 

(i) Compute and store quadrature points and weights, 
basis function values, and projection matrices. 

(ii) Set the initial conditions by means of an -projection. 

(iii) Apply slope limiter. 

(iv) Apply positivity limiter. 

(v) While t < fmax: 

(a) Compute timestep. 

(b) For every RK stage: 

(1) Calculate Rj( of the differential equation (23) 

(inner and outer integral). 

(2) Update solution to the next RK stage. 

(3) Update node data. 

(4) Apply slope limiter. 

(5) Apply positivity limiter. 

(c) Do refinement and derefinement. 

(d) t = t + At. 

The basis function values and the quadrature data are com¬ 
puted in a general way for arbitrary spatial order as outlined in Ap¬ 
pendices A, B, and C. For the time integration, our code can be 
provided with a general Butcher tableau (Table Dl), specifying the 
desired RK method. By keeping these implementations general, the 
spatial order and time integrator can be changed conveniently. 

The first step of an RK stage consists of calculating Rk of the 
differential equation (23), including possible source terms. This in¬ 
volves computing the inner integral by looping over the cells and 
the outer integral by looping over the cell interfaces. After updating 
the solution weights for every cell, the hydrodynamical quantities 
on the AMR nodes have to be updated, such that they can be subse¬ 
quently accessed during the slope-limiting procedure (Section 4.4). 
After the RK step, all cells are checked for refinement and derefine¬ 
ment, and the mesh is adjusted accordingly. 
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Figure 4. Definition of neighbours in the slope-limiting procedure at AMR boundaries, shown for the 2D version of TENET. The neighbours of cell K on the 
left are on a finer level. Eor the slope limiting of cell K, the node weights are used, which are calculated by projecting the solutions of the subcells onto the 
encompassing node volume Wk- The right neighbour of cell K on the other hand is here a coarser cell; in this case, the solution of this neighbour has to be 
projected onto the smaller volume Ek- In the slope limiting routine of TENET, this is only done in a temporary fashion without actually modifying the mesh. 



Figure 5. Left-hand panel: El error norm as a function of linear resolution for the two-dimensional isentropic vortex test. Each data point corresponds to 
a simulation, and different colours indicate the different methods. We find a convergence rate as expected (dashed line) or slightly better for all schemes, 
indicating the correctness of our DG implementation. Middle panel: the same simulation enors as a function of degrees of freedom (DOE), which is an 
indicator for the memory requirements. For our test runs the higher order DG methods are more accurate at the same number of DOE. Right-hand panel: LI 
error norm versus the measured run time of the simulations. The second order FV implementation (FV-2) and the second order DG (DG-2) realization are 
approximately equally efficient in this test, i.e. a given precision can be obtained with a similar computational cost. In comparison, the higher order methods 
can easily be faster by more than an order of magnitude for this smooth problem. This illustrates the fact that an increase of order (p-refinement) of the 
numerical scheme can be remarkably more efficient than a simple increase of grid resolution (^-refinement). 


5 VALIDATION 

In this section, we discuss various test problems which are either 
standard tests or chosen for highlighting a specific feature of the 
DG method. For most of the test simulations, we compare the re¬ 
sults to a traditional second order FV method (FV-2). For definite¬ 
ness, we use the AREPO code to this end, with a fixed Cartesian 
grid and its standard solver as described in Springel (2010). The 
latter consists of a second order unsplit Godunov scheme with an 
exact Riemann solver and a non-TVD slope limiter. Recently, some 
modifications to the FV solver of AREPO have been introduced for 
improving its convergence properties (Pakmor et al. 2015) when 
the mesh is dynamic. However, for a fixed Cartesian grid, this does 


not make a difference and the old solver used here performs equally 
well. 


There are several important differences between FV and DG 
methods. In an FV scheme, the solution is represented by piece- 
wise constant states, whereas in DG the solution within every cell 
is a polynomial approximation. Moreover, in FV a reconstruction 
step has to be carried out in order to recreate higher order infor¬ 
mation. Once the states at the interfaces are calculated, numerical 
fluxes are computed and the mean cell values updated. In the DG 
method, no higher order information is discarded after completion 
of a step and therefore no subsequent reconstruction is needed. DG 
directly solves also for the higher order moments of the solution 
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and updates the weights of the basis functions in every cell accord¬ 
ingly. 

For all the DG tests presented in this section, we use an SSP 
RK time integrator of an order consistent with the spatial discretiza¬ 
tion, and the fluxes are calculated with the HLLC approximate Rie- 
mann solver. Furthermore, if not specified otherwise, we use for 
all tests the positivity limiter in combination with the character¬ 
istic limiter in the bounded version with the parameters /3 = 1 
and M = 0.5. Specifically, we only deviate from this configura¬ 
tion when we compare the limiting of the characteristic variables 
to the limiting of the conserved variables in the shock tube test in 
Section 5.2, and when the angular momentum conservation of our 
code is demonstrated in Section 5.5 with the cold Keplerian disc 
problem. The higher order efficiency of our DG code is quantified 
in the test problem of Section 5.1, and the 3D version of our code 
is tested in a Sedov-Taylor blast wave simulation in Section 5.3. In 
Section 5.4, we show that advection errors are much smaller for DG 
compared to FV-2. Finally, the AMR capabilities of TENET are il¬ 
lustrated with a high-resolution Kelvin-Helmholtz (KH) instability 
test in Section 5.6. 


5.1 Isentropic vortex 

In this first hydrodynamical test problem, we verify the correctness 
of our DG implementation by measuring the convergence rate to¬ 
wards the analytic solution for different orders of the scheme. Addi¬ 
tionally, we investigate the precision of the FV-2 and DG schemes 
as a function of computational cost. 

An elementary test for measuring the convergence of a hydro- 
dynamical scheme is the simulation of one-dimensional travelling 
waves at different resolutions (e.g. Stone et al. 2008). However, the 
DG scheme performs so well in this test, especially at higher or¬ 
der, that the accuracy is very quickly limited by machine precision, 
making the travelling sound wave test impractical for convergence 
studies of our DG implementation. We hence use a more demand¬ 
ing setup, the stationary and isentropic vortex in two dimensions 
(Yee et al. 1999). The primitive variables density, pressure, and ve¬ 
locities in the initial conditions are 


pir) = p^. 


p(r) = 


(y-i)y8- 


exp(l — r) 


vAr) = -(y-yo);:r exp 
In 

Vy{r) = (x-Xq)^ exp 
In 


2 

\-d 


with p = 5, and the adiabatic index y = 7/5. With this choice for 
the primitive variables, the centrifugal force at each point is exactly 
balanced by the pressure gradient pointing towards the centre of the 
vortex, yielding a time-invariant situation. 

The vortex is smooth and stationary, and every change during 
the time integration with our numerical schemes can be attributed 
mainly on numerical truncation errors. In order to break the spher¬ 
ical symmetry of the initial conditions, we additionally add a mild 
velocity boost of Vb = (1,1) everywhere. The simulation is carried 
out in the periodic domain (x,y) 6 [0,10]^ with the centre of the 
vortex at (xo,yo) = (5,5) and run until t = 10, corresponding to 
one box crossing of the vortex. We compare the obtained numeri¬ 
cal solution with the analytic solution, which is given by the initial 
conditions. The error is measured by means of the LI density norm. 


which we define for the FV method as 

= (59) 

i 

where N is the total number of cells, and p' and p® denote the den¬ 
sity in cell i in the final state and in the initial conditions, respec¬ 
tively. This norm should be preferred over the L2 norm, since it is 
more restrictive. 

For DG the error is inferred by calculating the integral 

LI = ^ \p'(x,y) - p°(x,y)\ dV, (60) 

with the density solution polynomial p'(x,y) and the analytic ex¬ 
pression p°(x,y) of the initial conditions. We compute the integral 
numerically with accuracy of order p + 2, where p is the order of 
the DG scheme. In this way, we assert that the error measurement 
cannot be dominated by errors in the numerical integration of the 
error norm. 

The result of our convergence study is shown in the left-hand 
panel of Figure 5. For every method we run the simulation with 
several resolutions, indicated by different symbols. The LI errors 
decrease with resolution and meet the expected convergence rates 
given by the dashed lines, pointing towards the correct implemen¬ 
tation of our DG scheme. At a given resolution, the DG-2 code 
achieves a higher precision compared to the FV-2 code, reflecting 
the larger number of flux calculations involved in DG. The con¬ 
vergence rates are equal, however, as expected. The higher order 
DG formulations show much smaller errors, which also drop more 
rapidly with increasing resolution. 

In the middle panel of Fig. 5, we show the same plot but sub¬ 
stitute the linear resolution on the horizontal axis with the number 
of degrees of freedom (DOF). This number is closely related to 
the used memory of the simulation. According to equation (10) the 
number of DOF per cell for 3D DG is l/6p(p l)(p 2). For two- 
dimensional simulations one obtains l/2p(p 1) DOF for every 
cell. The total number of DOF is thus given by N, 3N, 6N, and lOV 
for FV-2, DG-2, DG-3, and DG-4, respectively. At the same num¬ 
ber of DOF, as marked by the grey dashed lines, the FV-2 and DG-2 
code achieve a similar accuracy. Moreover, when using higher or¬ 
der DG methods the precision obtained per DOF is significantly 
increased in this test problem. 

In the right-hand panel of Fig. 5, we compare the efficien¬ 
cies of the different schemes by plotting the obtained precision as 
a function of the measured run time, which directly indicates the 
computational cost.* By doing so, we try to shed light on the ques¬ 
tion of the relative speed of the methods (or computational cost 
required) for a given precision. Or alternatively, this also shows the 
precision attained at a fixed computational cost. 

For the following discussion, we want to remark that the isen¬ 
tropic vortex is a smooth 2D test problem; slightly different con¬ 
clusions may well hold for problems involving shocks or contact 
discontinuities, as well as for 3D tests. The FV-2 method consumes 
less time than the DG-2 method when runs with equal resolution 
are compared. On the other hand, the error of the DG-2 scheme 
is smaller. As Fig. 5 shows, when both methods are quantitatively 


' The simulations were performed with four MPI-tasks on a conventional 
desktop machine, shown is the wall-clock time in seconds. A similar trend 
could also be observed for larger parallel environments; while DG is a 
higher order scheme, due to its discontinuous nature the amount of re¬ 
quired communication is comparable to that of a traditional second order 
FV method. 
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Figure 6. Left-hand panels: Sod shock tube problem calculated with DG-3 and different limiting approaches. We show the full polynomial DG solutions of 
the density, where the different colours correspond to different cells. The best result is achieved when the limiting is carried out based on the characteristic 
variables. As desired, the numerical solution is discontinuous and of low order at the shock. If the conserved variables are limited instead, the obtained solution 
is much less accurate, including numerical oscillations. The bottom-left panel shows the result without a slope limiter. The solution is of third order in every 
cell (parabolas) leading to an under- and overshooting at the shock as well as spurious oscillations. Right-hand panels: comparison of the mean cell values with 
the analytic solution for the characteristic valuables limiter. Advection errors wash out the solution at the contact rapidly, until it can be represented smoothly 
by polynomials. Overall, we find very good general agreement with the analytic solution, especially at the position of the shock. 
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compared at equal precision, the computational cost is essentially 
the same; hence, the overall efficiency of the two second order 
schemes is very similar for this test problem. Interestingly, the 64^- 
cell FV-2 run (yellow triangle) and the 32^-cell DG-2 simulation 
(green square) give an equally good result at almost identical run¬ 
time. 

However, the higher order schemes (DG-3, DG-4) show a sig¬ 
nificant improvement over the second order methods in terms of 
efficiency, i.e. prescribed target accuracies as indicated by the grey 
dashed lines can be reached much faster by them. In particular, the 
third order DG (DG-3) scheme performs clearly favourably over 
the second order scheme. The fourth order DG (DG-4) method uses 
the SSP RK-4 time integrator, which has already five stages, in con¬ 
trast to the time integrators of the lower order methods, which have 
the optimal number of p stages for order p. Moreover, the timestep 
for the higher order methods is smaller; according to equation (24), 
it is proportional to l/(2p - 1). Nevertheless, DG-4 is the most 
efficient method for this test. Consequently, for improving the cal¬ 
culation efficiency in smooth regions, the increase of the order of 
the scheme (p-refinement) should be preferred over the refinement 
of the underlying grid (/i-refinement). 

For FV methods, this principle is cumbersome to achieve from 
a programming point of view, because for every order a different 
reconstruction scheme has to be implemented, and moreover, there 
are no well-established standard approaches for implementing ar¬ 
bitrarily higher order FV methods. Higher order DG methods on 
the other hand can be implemented straightforwardly and in a uni¬ 
fied way. If the implementation is kept general, changing the order 
of the scheme merely consists of changing a simple parameter. In 
principle, it is also possible to do the p-refinement on the fly. In this 
way, identified smooth regions can be integrated with large cells 
and high order, whereas regions close to shocks and other disconti¬ 
nuities can be resolved with many cells and a lower order integra¬ 
tion scheme. 

5.2 Shock tube 

Due to the non-linearity of the Euler equations, the characteristic 
wave speeds of the solution depend on the solution itself. This 
dependence and the fact that the Euler equations do not diffuse 
momentum can lead to an inevitable wave steepening, ultimately 
producing wave breaking and the formation of mathematical dis¬ 
continuities from initially smooth states (Toro 2009, and references 
herein). Such hydrodynamic shocks are omnipresent in many hy- 
drodynamical simulations, particularly in astrophysics. The proper 
treatment of these shocks is hence a crucial component of any nu¬ 
merical scheme for obtaining accurate hydrodynamical solutions. 

In a standard FV method based on the RSA approach, the so¬ 
lution is averaged within every cell at the end of each timestep. The 
solution is then represented through piecewise constant states that 
can have arbitrarily large jumps at interfaces, consequently allow¬ 
ing shocks to be captured by construction. Similarly, the numerical 
solution in a DG scheme is discontinuous across cell interfaces, al¬ 
lowing for an accurate representation of hydrodynamic shocks. We 
demonstrate the shock-capturing abilities of our code with a clas¬ 
sical Sod shock tube problem (Sod 1978) in the two-dimensional 
domain (x,y) e [0,1]“ with 64^ cells. The initial conditions consist 
of a constant state on the left, pi = 1, pi = 1, and a constant state 
on the right, = 0.125, P; = 0.1, separated by a discontinuity at 
X = 0.5. The velocity is initially zero everywhere, and the adiabatic 
index is chosen to be y = 7/5. 

In Figure 6 we show the numerical and analytic solution at 


tend = 0.228 for DG-3, and for different limiting strategies. Several 
problems are apparent when a true discontinuity is approximated 
with higher order functions, i.e. the rapid convergence of the ap¬ 
proximation at the jump is lost, the accuracy around the discontinu¬ 
ity is reduced, and spurious oscillations are introduced (Gibbs phe¬ 
nomenon, see for example Arfken & Weber 2013). In the bottom- 
left panel of Fig. 6, we can clearly observe oscillations at both dis¬ 
continuities, the contact and the shock. Here the result is obtained 
without any slope limiter; merely the positivity limiter slightly ad¬ 
justs higher order terms such that negative density and pressure val¬ 
ues in the calculation are avoided. Nevertheless, the obtained solu¬ 
tion is accurate at large and has even the smallest LI error of the 
three approaches tested. 

The oscillations can however be reduced by limiting the sec¬ 
ond order terms of the solution, either expressed in the charac¬ 
teristic variables (upper-left panel) or in the conserved variables 
(middle-left panel). Strikingly, the numerical solution has a consid¬ 
erably higher quality when the characteristic variables are limited 
instead of the conserved ones, even though in both tests the same 
slope-limiting parameters are used (j8 = 1, M = 0.5). The for¬ 
mer gives an overall satisfying result with a discontinuous solution 
across the shock, as desired. The contact discontinuity is less sharp 
and smeared out over ~5 cells. This effect arises from advection 
errors inherent to grid codes and can hardly be avoided. Neverthe¬ 
less, the DG scheme produces less smearing compared to an FV 
method once the solution is smooth, see also Section 5.4. Hence, 
the higher order polynomials improve the sharpness of the contact. 
In the right-hand panels of Fig. 6, we compare the mean density, 
pressure, and velocity values of the numerical solution calculated 
with the characteristic limiter with the analytic solutions, and find 
good agreement. Especially, the shock is fitted very well by the 
mean values of the computed polynomials. To summarize, limiting 
of the characteristic variables is favourable over the limiting of the 
conserved variables. With the former, our DG code produces good 
results in the shock tube test thanks to its higher order nature, which 
clearly is an advantage also in this discontinuous problem. 

5.3 Sedov-Taylor blast wave 

The previous test involved a relatively weak shock. We now con¬ 
front our DG implementation with a strong spherical blast wave by 
injecting a large amount of thermal energy E into a point-like re¬ 
gion of uniform and ultra-cold gas of density p. The well-known 
analytic solution of this problem is self-similar and the radial po¬ 
sition of the shock front as a function of time is given by R{t) = 
RoiEfilpY^^ (see for example Padmanabhan 2000). The coefficient 
Rq depends on the geometry (ID, 2D, or 3D) and the adiabatic in¬ 
dex y; it can be obtained by numerically integrating the total energy 
inside the shock sphere. Under the assumption of a negligible back¬ 
ground pressure, the shock has a formally infinite Mach number 
with the maximum density compressionPmax/p = (y + l)/(y - !)■ 

Numerically, we set up this test with 64^ cells in a three- 
dimensional box (x,y,z) e [0,1]^ containing uniform gas with 
p = 1, p = 10“®, and y = 5/3. The gas is initially at rest and the 
thermal energy £ = 1 is injected into the eight central cells. Fig. 7 
shows numerical solutions obtained with FV-2 as well as with DG- 
2 and DG-3 at f = 0.05. The density slices in the top panels in¬ 
dicate very similar results; only the logarithmic colour-coding re¬ 
veals small differences between the methods in this test. Unlike La- 
grangian particle codes, Cartesian grid codes have preferred coor¬ 
dinate directions leading to asymmetries in the Sedov-Taylor blast 
wave problem, especially in the inner low-density region. FV-2 pro- 
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Figure 7. Three-dimensional Sedov-Taylor shock wave simulations at 1 = 0.05 calculated on a 64^-cell grid with FV-2 as well as with DG-2 and DG-3. The 
panels on top display central density slices (z = 0) on a logarithmic scale. In this test, Cartesian grid codes deviate noticeably from spherical symmetry in 
the central low-density region, and the shape of this asymmetry depends on details of the different methods. In the bottom panels, we compare the numerical 
results with the analytic solution. For each method, we plot the mean density of about every 200th cell, and for DG also the solution polynomials in the 
diagonal direction along the coordinates from (0.5,0.5,0.5) to (1,1,1). The obtained results are very similar in all three methods considered here, and DG does 
not give a significant improvement over FV for this test. 


duces a characteristic cross in the centre. In DG the asymmetries 
are significantly weaker; only a mild cross and squared-shape trace 
of the initial geometry of energy injection are visible. These effects 
can be further minimized by distributing the injected energy across 
more cells, for a point-like injection; however, they cannot be com¬ 
pletely avoided in grid codes. 

In the bottom panels of Fig. 7, we compare the numerical 
results to the analytic solution obtained with a code provided by 
Kamm & Timmes (2007); in particular we have Rq x 1.152 for 
7 = 5/3 in 3D. Due to the finite and fixed resolution of our grid 
codes, the numerical solutions do not fully reach the analytic peak 
compression of pn,ax = 4. More importantly, the DG method does 
not provide a visible improvement in this particular test, and fur¬ 
thermore, the results obtained with DG-2 and DG-3 are very sim¬ 
ilar. The reason behind these observations is the aggressive slope 
limiting due to the strong shock, suppressing the higher order poly¬ 
nomials in favour of avoiding over- and undershootings of the so¬ 
lution. We note that a better result could of course be achieved by 
refining the grid at the density jump and thereby increasing the ef¬ 
fective resolution. However, arguably the most important outcome 
of this test is that the DG scheme copes with an arbitrarily strong 
shock at least as well as a standard FV scheme, which is reassur¬ 


ing given that DG’s primary strength lies in the representation of 
smooth parts of the solution. 


5.4 Square advection 

SPH, moving mesh, and mesh-free approaches can be implemented 
such that the resulting numerical scheme is manifestly Galilean in¬ 
variant, implying that the accuracy of the numerical solution does 
not degrade if a boost velocity is added to the initial conditions. On 
the other hand, numerical methods on stationary grids are in general 
not manifestly Galilean invariant. Instead, they produce additional 
advection errors when a bulk velocity is added, which have the po¬ 
tential to significantly alter, e.g., the development of fluid instabili¬ 
ties (Springel 2010). While the numerical solution is still expected 
to converge towards the reference solution with increasing resolu¬ 
tion in this case (Robertson et al. 2010), this comes at the price of a 
higher computational cost in the form of a (substantial) increase of 
the number of cells and an accompanying reduction of the timestep 
size. 

We test the behaviour of the DG method when confronted with 
high advection velocities by simulating the supersonic motion of 
a square-shaped overdensity in hydrostatic equilibrium (following 
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Figure 8. Density maps and centred slices for the advection test with 64^ cells: a fluid with a squai'e-shaped overdensity in hydrostatic equilibrium is advected 
supersonically, crossing the periodic box several hundred times. The FV-2 method shows large advection errors in this test and the square is smeared out 
completely by f = 10. On the other hand, the advection errors in the DG method become small once the solution is smooth. DG-3 shows less dilfusion than 
DG-2 due to the higher order representation of the advected shape. The time evolution of the density en'ors for the three simulations is shown in Fig. 9. 


Hopkins 2014). For the initial conditions, we choose a y = 7/5 
fluid with p = 1, p = 2.5, Vx = 100, and Vy = 50 everywhere, except 
for a squared region in the centre of the two-dimensional periodic 
box, (x,y) e [0,1]“ with side lengths of 0.5, where the density is 
Ps = 4. The test is run with a resolution of 64^ cells until t = 10, 
corresponding to 1000 transitions of the square in the x-direction 
and 500 in the y-direction. 

In Fig. 8, we visually compare the results obtained with DG-2, 
DG-3, and the FV-2 scheme. Already at f = 1, the FV method has 
distorted the square to a round and asymmetric shape, and at r = 10, 


the numerical solution is completely smeared out’. In comparison, 
DG shows fewer advection errors, and a better approximation of the 
initial shape can be sustained for a longer time. Especially, the run 
with third order accuracy produces a satisfying result. Note that due 

^ In our FV-2 method, the overdensity moves slightly faster in the direction 
of advection and is clearly not centred any more at f = 10. We have inves¬ 
tigated this additional eiTor and found that its occuiTence depends on the 
choice of the slope limiter. Either way, the solution is completely washed 
out and the FV-2 method does not provide a satisfying result in this test. 
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Figure 9. Time evolution of the LI density error norm for the square advec- 
tion test. The grey lines indicate the initial slopes of error growth, and the 
encircled data points correspond to the plots shown in Fig. 8. Interestingly, 
FV-2 and DG-2 have the same polynomial growth, though the absolute error 
is smaller by a factor of k 1.5 for DG. On the other hand, the DG-3 method 
does not only decrease the absolute en'or in this test, but the error growth 
rate is also significantly smaller. 

to the high advection speed the CFL timestep is very small, leading 
to around 1.7 and 3.3 million timesteps at f = 10 with DG-2 and 
DG-3, respectively. 

In the bottom panels of Fig. 8 we show one-dimensional 
slices of the density solution polynomials from (x, y) = (0,0.5) 
to (x,y) = (1,0.5), and in grey the initial conditions as reference. 
With DG-2, an immediate over- and undershooting at the disconti¬ 
nuity can be observed owing to the adopted non-TVD slope limiter 
(j6 = 1, M = 0.5). More interestingly, while in the FV-2 method 
the solution is only washed out by diffusion, for DG-2 it is also 
steepened, which leads to a higher maximum density at t = 10 
than present in the initial conditions. This difference arises from 
the updating of the slopes in DG instead of reconstructing them, 
and moreover, it vanishes if we use TVD limiter parameters. With 
the latter the slopes are reduced aggressively, resulting in pure dif¬ 
fusion with a result similar to the FV-2 method. 

We quantify the quality of each scheme in this test by measur¬ 
ing the time evolution of the LI density error norm, the result of 
which is presented in Fig. 9. The FV-2 and DG-2 codes show the 
same initial polynomial error growth, oc t** ; however, the absolute 
error is much smaller for DG, i.e. the FV-2 error at t = 1 is reached 
with DG-2 only at a much later time, at t = 10. With the DG-3 code 
(quadratic basis functions), the error grows only as oc leading 
to an acceptably small error even until t = 70, which corresponds 
to 23 million timesteps. Also with this higher order approximation 
the solution is transformed to a smooth numerical solution, but the 
mean cell values are less modified compared to the second order 
code with linear basis functions. 

As demonstrated above, DG produces far smaller advection 
errors compared to an FV method. But how can this be understood, 
especially since the DG scheme is also not Galilean invariant? A 
powerful tool for studying the behaviour of discretizations in com¬ 


putational fluid dynamics is the modified equation analysis. Here, 
the discrete equations are expanded by means of a Taylor series, 
leading to a so-called modified equation which includes the target 
equation to solve and additional terms. For example, the modified 
equation of the first order upwind method for the scalar advection 
equation is an advection-diffusion equation (LeVeque 2002). The 
scheme solves the advection equation to first order by construction, 
but at the same time it effectively solves this advection-diffusion 
equation to second order accuracy, explaining the large diffusion 
errors when adopting this simple scheme, and pointing towards an 
explanation for the observed diffusion in our FV-2 method. Such 
an analysis proves to be more difficult for DG; nevertheless Moura 
et al. (2014) recently accomplished a modified equation analysis 
for the second order DG scheme applied to the linear advection 
equation. In this case, the modified equation consists of a physical 
mode and an unphysical mode moving at the wrong speed, which is 
however damped very quickly. For upwind fluxes the leading error 
term of the physical mode is diffusive and of third order, which is 
better than naively expected for a second order method, and this is 
likely one of the keys for the improved behaviour of DG we find. A 
heuristic argument for the superiority of DG in this test is that after 
an initial smoothing of the global numerical solution, it is not only 
continuous inside every cell but also at the cell interfaces. If the 
solution is perfectly continuous across cell interfaces, the left and 
right state entering the Riemann solver are identical, and the cal¬ 
culated flux is always related by a simple Galilei boost to the flux 
calculated in any other frame of reference. In this case, no mani¬ 
fest differences in the conserved quantities in a cell due to the flux 
calculation of the surface integral (20) can arise under a Galilean 
transformation, implying that advection errors must be minimal. 
Nevertheless, some small averaging errors will arise in practice if 
the current profile cannot be represented exactly at an arbitrary grid 
position with the given set of cell basis functions. 

5.5 Keplerian disc 

Rotating gas discs are omnipresent in our Universe, for example 
in galaxies, accretion discs around black holes, or protostellar and 
protoplanetary discs. Numerically, such objects are ideally mod¬ 
elled either with a Lagrangian method or with a grid code which 
operates with a suitably tailored mesh geometry and furthermore 
accounts for part of the angular rotation in the solver (Masset 2000). 
If a simulation contains however several rotating objects at differ¬ 
ent locations and with different orientations, as for example is the 
case in cosmological galaxy formation simulations, no preferable 
grid geometry exists. In this case, Cartesian grids are often adopted, 
optionally with AMR, making it much more challenging to avoid 
spurious errors in differentially rotating gas flows. 

These problems can be exposed by the demanding problem 
of a cold Keplerian disc, where real pressure forces are negligibly 
small and any spurious hydrodynamic forces from numerical errors 
result in readily visible perturbations of the system. We subject our 
DG code to this test following a similar setup as recently used in 
other works (Hopkins 2014; Pakmor et al. 2015). The ambient gas 
of the disc has negligible density and pressure, and is initially con¬ 
fined to lie between two radii. Every fluid element of the disc is 
rotating on a Keplerian orbit where the centrifugal forces are bal¬ 
anced by an external and static central gravitational potential. Gas 
self-gravity is not included. Analytically, the system is in perfect 
equilibrium and the initial state should not change in time. The dif¬ 
ficulty for hydro codes applied to this test lies in the lack of pressure 
support, as well as the differential rotation which leads to shearing 
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Figure 10. Density evolution in the cold Keplerian disc problem. The cen¬ 
trifugal force acting on the rotating disc is balanced by an external static 
central potential such that every fluid element is on a Keplerian orbit. This 
test is very challenging for Caifesian grid codes, as shear flows without 
pressure support are prone to numerical instabilities. Our FV-2 method is 
stable for around 10 orbits, at which point the disc gets disrupted eventu¬ 
ally. In contrast, without a slope limiter DG conserves angular momentum 
accurately and can handle this problem very well. 


flows. Both can trigger numerical instabilities and the eventual dis¬ 
ruption of the disc. In particular, it is clear that for codes which do 
not conserve angular momentum exactly, this point will inevitably 
be reached eventually. In SPH codes, angular momentum is con¬ 
served but angular momentum transport is caused by the use of ar¬ 
tificial viscosity, which typically is active at a small level in strong 
shear flows. Using an improved switch for the viscosity, however. 


the Keplerian disc problem can be integrated accurately (Cullen & 
Dehnen 2010). 

For definiteness, the initial conditions we use for our DG test 
are as follows. We use the computational domain (x,y) 6 [0,6]“ 
with periodic boundaries, and gas with an adiabatic index of y = 
5/3. The primitive variables are initially set to 


P = Po, 


p(r’) = 


Po 


PD-PO ( J 
Ar I 


(0.5 ■ 


PD 

pg-PD 

Ar 

Po 


-{2-^ ))+Po 


if r’ < 0.5 - f 


^))-Hpo ifO.5- 


f < P < 0.5 -b f 


if 0.5 -b f < P < 2 - f 
if2-f:<P<2-bf 
if P > 2-b f. 


vAx’,y') 


Vy(x',y') 


_y/p3/2 

if 0.5 - 2Ar < P < 2 -l- 2Ar 

0 

else. 

■y/p3/2 

if 0.5 - 2Ar < r' < 2 -H 2Ar 

0 

else. 


where the coordinates x', y', and r' are measured in a coordinate 
system with the origin at the centre (xo,yo) = (3,3) of the box. The 
values used for the background gas are po = 10“^ and po = 10“^. 
The disc has a density of po = 1 and a radial extent of [r^jn = 
0-5, tmax = 2], with a small transition region of width Ar = 0.1. We 
adopt a time-independent external acceleration a = -VO with the 
components 


aAx',y') 

aAx',y') 


-x’lr'^ 

-x7[p(p2 -b d)] 


if 0.5 - 0.5Ar < P 
else. 


-y7[P(p2 -b e7] 


if 0.5 - 0.5Ar < P 
else. 


where e - 0.25 smoothes the potential in the very inner regions in 
order to avoid a singularity there. The orbital period of the Keple¬ 
rian disc depends on the radius and is given by T = 

We evolve the system until 1 = 120, corresponding to around 
20 orbits at r = 1, and present the results in Fig. 10. When the FV-2 
method is used, the edges of the disc are washed out immediately 
but the disc is stable for around 10 orbits. Like the majority of FV 
methods in use, our scheme does not manifestly preserve angular 
momentum, resulting in secular integration errors and an eventual 
breakdown of the quasi-static solution. The number of orbits the 
disc survives depends mainly on how carefully the problem has 
been set up, as well as on the choice of slope limiter and resolution 
used. However, the disruption of the disc is inevitable and can only 
be delayed with adjustments of these parameters. 

On the other hand, DG schemes of second order and higher 
are inherently angular momentum conserving and can hence po¬ 
tentially handle this test problem much more accurately. To test for 
this, we have disabled the slope limiter of the DG scheme and use 
only the positivity limiter. This is because with our simple minmod 
limiter, angular momentum conservation is also violated and would 
result in a similar disruption as observed with FV-2. The construc¬ 
tion of an improved angular momentum preserving limiting scheme 
is hence desirable and worthwhile subject for future work. With the 
positivity limiter alone, the DG-2 scheme can integrate the disc sta¬ 
bly and gives good results at t = 120, corresponding to about 20 
orbits at r = 1. Merely two fine rings with a slightly higher density 
can be observed in the inner and outer regions of the disc, which 
can be attributed to the gentle overshooting of the solution at the 
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Figure 11. High-resolution KH simulation with DG-4 and AMR at time t = 0.8. The simulation starts with 64^ cells (level 6) and refines down to level 12, 
corresponding to an eflFective resolution of 4096^. We illustrate the AMR levels in Fig. 12. The mesh refinement approach renders it possible to resolve fractal 
stmctures created by secondary billows on top of the large-scale waves. Furthermore, as can be seen in the bottom panel, the solution within every cell contains 
rich information, consisting of a third order polynomial. A movie of the simulation until t = 2 may be accessed online: http: //youtu.be/cTRQP6DSaqA 
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AMR level 


Figure 12. Map of the AMR levels of the KH simulation at f = 0.8. We 
here refine in space where the density gradient is steep, allowing us to cap¬ 
ture interesting regions with high resolution and save computational time in 
places where the solution is smooth. 


rims of the disc. We have also carried out this simulation with the 
DG-3 code. In this case, the disc also does not break down, but the 
solution shows some mild oscillations with amplitude up to 20 per¬ 
cent of the initial density; without applying a limiter these cannot 
be suppressed. 


5.6 KH instability 

The KH instability is one of the most important fluid instabilities in 
astrophysics, for example, it plays an important role in the stripping 
of gas from galaxies falling into a galaxy cluster. The instability is 
triggered by shear flows, often also involving fluids with different 
densities, and grows exponentially until the primary billows break, 
subsequently leading to a turbulent mixing of the two phases. The 
KH instability can be investigated with initial conditions consisting 
either of a sharp surface between two phases or with a transition re¬ 
gion separating the layers smoothly. Analytic growth rates for the 
linear regime can be derived for both cases (Chandrasekhar 1961; 
Hendrix & Keppens 2014); however, the thickness of a smooth 
transition layer sets a limit on the minimum wavelength which can 
become unstable in the linear phase. This suppression is desired if 
one wants to compare growth rates inferred from simulations with 
the analytic growth rate (McNally et al. 2012), since the underlying 
mesh can trigger the instability of waves at the resolution scale due 
to noise, a numerical effect which does not vanish with increas¬ 
ing resolution. Nevertheless, we set up a sharp discontinuity and 
use the KH instability as a high-resolution test for the robustness of 
our AMR implementation and DG’s capabilities of capturing small- 
scale turbulent structures. 

The initial conditions are chosen as in Springel (2010); in the 


periodic box (x,y) e [ 0 , 1 ]^ the primitive variables at t = 0 are set 
to 


P = 2.5, 


p(x,y) = 


if 0.25 < y < 0.75 
else. 


Vxix,y) 


0.5 if 0.25 < y < 0.75 
-0.5 else. 


Vy(x,y) = Wo sin(4;rjc) 



(y-0.25)^ 

2cr^ 


+ exp 


(y - 0.75)2 


2cr2 


with Wo - Q.l, cr = 0.05/ V2, and the adiabatic index y = 7/5. The 
velocity perturbation in the y-direction supports the development 
of a specific single mode on large scales. We start with an initial 
resolution of 64^ cells (level 6 ) and refine where the density slope 
is steep, as described in Section 4.1. The maximum refinement level 
is set to 12, corresponding to an effective resolution of 4096^ cells. 
A sharp discontinuity between the two layers in combination with 
AMR leads to a fast transition into the non-linear regime and gen¬ 
erates secondary billows early on. 

We illustrate the state of the simulation at t = 0.8 in Fig. 11. 
The panel on the top left shows the density for the whole two- 
dimensional simulation box; in the following panels we zoom in 
repeatedly. The DG scheme shows only little diffusion and mixing 
thanks to the higher order and the avoidance of reconstruction steps, 
allowing the formation of very fine structures. Smaller KH instabil¬ 
ities arise on top of the large-scale waves demonstrating the frac¬ 
tal nature of this test problem. Self-similar instabilities are present 
across different scales, and an ending of this pattern is only set by 
the limited resolution. 

The adaptive mesh used by the calculation is overlaid in the 
bottom-right panel, revealing three different AMR levels in this 
subbox. The density gradients are well resolved with the finest 
AMR level (level 12), whereas smooth regions are on smaller lev¬ 
els. Furthermore, technical features of the AMR mesh structure can 
be seen in the plot: the level difference between neighbouring cells 
is at most one, and the mesh smoothing algorithm introduces an 
additional cell layer around physically refined cells. Fig. 12 shows 
a map of the AMR levels of the whole simulation box at f = 0.8, 
which can be directly compared to the top-left panel of Fig. 11. 
The number of cells at the displayed instance is around 1.8 million 
cells, which corresponds to about 10 percent of the effective level 
12 resolution 4096^, highlighting the efficiency gain possible with 
the AMR approach. Ideally, we would also like to use a correspond¬ 
ing adaptiveness in time by utilizing local timesteps, something that 
is planned as a code extension in future work. 


6 SUMMARY 

In this work, we have developed a 3D MPI-parallel higher order 
DG code with AMR for solving the Euler equations, called TENET, 
and investigated its performance in several astrophysically relevant 
hydrodynamic test problems. DG methods are comparatively new 
in astrophysics, despite a vast body of literature and long history of 
these approaches in the applied mathematics community. A number 
of highly attractive features of DG however suggest that it is timely 
to introduce these methods as standard tools in computational astro¬ 
physics. In particular, DG allows higher order to be reached with¬ 
out introducing communication beyond the immediate neighbour¬ 
ing cells, making it particularly well suited for parallel computing. 
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Also, the method is characterized by an enhanced continuous mem¬ 
ory access due to the increased number of DOF per cell, making 
it a better match for modem computer architectures, where float¬ 
ing point operations are “almost free” in comparison to slow and 
costly repeated memory accesses of small chunks. In addition, DG 
allows an easy and flexible way to reach arbitrarily higher order, 
quite unlike FV schemes. This makes it possible to also vary the 
order of a scheme in space, providing for additional flexibility in 
AMR approaches. 

Our tests furthermore clearly show that it is computationally 
worthwhile to employ higher order methods. While in general it 
depends on the problem which order is optimal, we have found in 
our tests that third and fourth order DG implementations are typ¬ 
ically computationally much more efficient compared with a sec¬ 
ond order code, at least in regions where the solution is reasonably 
smooth. Moreover, the numerical result is of higher quality also in 
non-smooth regions, especially shocks are represented very well, 
thanks to the discontinuous nature of the DG representation. They 
are typically captured within at most two to three cells. 

Nevertheless, close to shocks and discontinuities, possible 
limitations of the DG scheme become apparent. In favour of a 
higher overall accuracy, our DG scheme is not TVD, which can 
lead to over- and undershootings of the solution. Moreover, higher 
order methods tend to produce spurious oscillations at these loca¬ 
tions. The detection of troubled cells and the prevention of this un¬ 
wanted behaviour is an active topic of current research. The spuri¬ 
ous oscillations in our idealized test problems could be prevented 
effectively by limiting the slopes and discarding higher order terms 
of the solution when appropriate. For future real-world problems, a 
more sophisticated method is desirable. A very promising concept 
in this respect is the so-called /jp-adaption. In this approach, the 
grid is refined around shocks and discontinuities, while at the same 
time the degree of the polynomials is reduced, resulting in a lo¬ 
cally more robust scheme with less oscillations. Recently, Sonntag 
& Munz (2014) presented a practical realization of this concept by 
incorporating a subgrid TVD FV scheme into troubled DG cells. 
The subgrid resolution is chosen such that it has the same number 
of DOF as a DG cell. In this way, the timesteps of the FV cells are 
similar to the timesteps of the larger surrounding DG cells, and the 
scheme can be applied without significant computational overhead. 

The fundamental difference between DG and FV is that DG 
solves directly also for higher order moments of the solution, 
whereas in FV all higher order information is discarded in the im¬ 
plicit averaging at the end of each timestep, necessitating a sub¬ 
sequent reconstruction. This aspect of DG leads directly to two 
major advantages over traditional FV methods. First, DG produces 
significantly less advection errors, and furthermore, if the solution 
is smooth across cell interfaces, the numerical solution does not 
depend on the chosen Riemann solver. Secondly, DG inherently 
conserves not only mass, momentum, and energy, but also angular 
momentum. The conservation of angular momentum is very desir¬ 
able for many astrophysical applications, e.g. simulations involving 
discs, or objects like stars or molecular clouds in rotation. There is 
however one caveat which has to be kept in mind. The conservation 
can be spoiled by the limiting procedure, which is reminiscent of 
the situation in SPH, where angular momentum is spuriously trans¬ 
ported by artificial viscosity. Improving the limiter with the goal 
of global angular momentum conservation is hence desirable and a 
promising direction for future improvements of the DG implemen¬ 
tation. Finally, DG can also be comparatively easily generalized to 
the AMR technique, and importantly, this can be done without loss 
of accuracy, unlike in standard FV approaches. The higher order is 


formally preserved due to the usage of ‘hanging nodes’, which are 
the quadrature points of interfaces between cells of different sizes. 

The present work has focused on introducing our new code 
and highlighting its differences relative to FV schemes. Future 
developments will focus on more sophisticated shock capturing 
schemes, scaling improvements through Open-MP hybrid paral¬ 
lelization, as well as on incorporating magnetic fields and astro- 
physical source terms relevant in galaxy formation. The latter is 
greatly facilitated by the modular structure of TENET and its par¬ 
ent code AREPO. In a first science application of our DG code, we 
quantitatively analyse its capabilities of capturing driven turbulence 
structures (Bauer et al. 2015, in preparation). 
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APPENDIX A: LEGENDRE POLYNOMIALS 


We use Legendre Polynomials as basis functions for our discontin¬ 
uous Galerkin scheme. They can be calculated by solving Legen¬ 
dre’s differential equation: 


d 

d? 




-1- n{n -I- 1)P„(^) = 0, n € 


(Al) 


The first Legendre Polynomials are 

Po(f) = l 


P2{0 = - 1 ) 


Piik) = - 3^) 


(A2) 


P4(f) = ^(35^" - 30f + 3) Ps(^) = ^(63^5 - 70^' + 15^). 

O O 

They are pairwise orthogonal to each other, and moreover, we de 
fine the scaled polynomials 

P{^)„ = din + 1P(^)„ 


such that 

Jp,(k)Pj(k)dk = 


(A3) 


(A4) 


APPENDIX B: GAUSS-LEGENDRE QUADRATURE 


The numerical integration of a function / : [-1, +1] —> R- with the 
Gaussian quadrature rule of n points is given by a weighted sum, 



/(^)d^«2/(4°)<. 

<7=1 


(Bl) 


Here, e (-1,+1) are the Gaussian quadrature nodes and 
are the corresponding weights. To integrate a 2D function / : 
[-1,4-1]^ —> R the tensor product of the n Gauss points can be 
used, viz. 

X +1 /^d-I 

^ J ^ m,^2)dkid^2 

2 

- - Z/O-f • (B 2 ) 

9=1 r=l 4=1 

The «-point Gaussian quadrature rule is exact for polynomials of 
degree up to 2n - 1, and the one-dimensional nodes are given as the 
roots of the Legendre polynomial P„(^). We calculate them numer¬ 
ically by means of the Newton-Raphson method. As starting values 
of the iterative root finding approximate expressions for the roots 
can be used (see for example Lether & Wenston 1995), 
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1 




4^-1 
4n -I- 2 


q = I,.. .,n. 


(B3) 


Furthermore, the corresponding weights can be calculated as 
(Abramowitz & Stegun 2012) 


2 


q - 1,..., 71. 


(B4) 


With this approach, we can compute and store the necessary 
quadrature data in the initialization routine of our DG code for ar¬ 
bitrary spatial order. 


APPENDIX C: GAUSS-LEGENDRE-LOBATTO 
QUADRATURE 


Compared with Gaussian quadrature, the GLL quadrature rule is 
very similar but includes also the endpoints of the integration inter¬ 
val. Therefore, the ii-point GLL rule is exact for polynomials of de¬ 
gree In—3. The nodes are the roots of the function (l-^^)P^ |(^), 
and the corresponding weights are given by (Abramowitz & Stegun 
2012) 


71(71- l)P„_i(|,)2’ 


q = 2,... ,n - 1. 


(Cl) 


The weights of the endpoints are equal, Wi = oj,,, and the sum of 
the weights is 'Z,'q=i = 2. 


APPENDIX D: STRONG STABILITY PRESERVING 
RUNGE-KUTTA METHODS 

Let w(t} be the unknown scalar solution of the ordinary differential 
equation 

dw 

-h R(t, w) = 0. 

df 
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Table Dl. Runge-Kutta butcher tableau. 
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Table D2. SSP-RK2. 


c = and can be calculated as 
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where the definitions yi = y — (j) = jik, and p = l/( 2 c^) have 

been used. The eigenvector matrices for the flux Jacobians 
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Table D3. SSP-RK3. 
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The propagation of the numerical solution from timestep n to n + 1 
with an i-stage explicit RK method can be written as 




(D2) 


The factors b-, weight the sum over the solution derivatives fc, , which 
are evaluations of R{t, w), viz. 
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kj — —Rit^ + CjAf\ + AP{cijiki + rt/ 2^2 + ... + (D3) 

where the c, are nodes of the timestep interval. A RK scheme is 
fully specified by the weights hi, the nodes c,, and the RK matrix 
fl,y, it can be represented in compact form by means of a Butcher 
tableau (Table Dl). For our DG code we use strong stability pre¬ 
serving RK methods, which are convex combinations of forward 
Euler steps. In combination with a positivity preserving Riemann 
solver and the positivity limiter outlined in Section 3.4 negative 
pressure and density values can effectively be avoided in the hydro 
scheme. For the second order DG code we use SSP RK2 Heun’s 
method (D2), for our third order code the SSP RK3 from Table D3. 
These methods are optimal in the sense that the number of stages is 
equal to the order of the scheme. It can be proven that a fourth order 
method with this feature does not exist (Gottlieb & Shu 1998), we 
therefore adopt the 5-stage SSP RK 4 method tabulated in Table D4 
(Spiteri & Ruuth 2002). 
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respectively. 


APPENDIX F: REFINEMENT MATRICES 


APPENDIX E: EIGENVECTORS OF THE EULER 
EQUATIONS 

The Eigenvalues of the flux Jacobian Matrix Ai = are T,- = 

{vi - c, vi,Vi -I- c, Vi,vi). The corresponding Eigenvectors are the 
columns of the matrix 
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with the specific kinetic energy k = + + and the specific 

stagnation enthalpy h = c^/fy - 1) -r A:. The left eigenvectors of 
A| are the rows of the Matrix £.x = ■ -Cx is the linear transfor¬ 

mation operator from the conserved to the characteristic variables. 


Below, we list the refinement matrices for merging the cells 
into a coarser cell K = 6 [-1,1]^). We calcu¬ 

late the integrals by means of an exact numerical integration with 
{k+ 1 )^ quadrature points before the main loop of our code. 

Subcell A = {^1^ 6 [-1,0] X [-1,0] X [-1,0]) : 

(FI) 

[-i.ip 

Subcell B = (^1^ 6 [0,1] X [-1,0] x [-1,0]) : 

^j0,(^.,f2,^3)df. (F2) 

[-i.ip 

Subcell C = (^1^ 6 [-1,0] X [0,1] X [-1,0]) : 

(F3) 

M.lP 
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0 

0.39175222700392 
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0.47454236302687 

0.93501063100924 
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0.21766909633821 
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0.06796628370320 


0.36841059262959 
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0.11503469844438 


0.25189177424738 

0.20703489864929 


0.54497475021237 


0.14681187618661 


0.24848290924556 0.10425883036650 


0.27443890091960 


0.22600748319395 


Table D4. SSP-RK4. 


Subcell D = 6 [0,1] X [0,1] x [-1,0]) : 


(Pd)jJ = 
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Subcell £ = (6^6 [-1,0] X [-1,0] x [0,1]) : 

^ iff^)6(6,6,6)d6 (F5) 

n.ip 

Subcell £ = 6 [0,1] X [-1,0] x [0,1]) : 

[- 1 . 1 ]-’ 


Subcell G = (6^" 6 [-1,0] X [0,1] x [0,1]) : 

[- 1 , 1 ]’ 


(F7) 


Subcell H = 6 [0,1] X [0,1] x [0,1]) : 
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